Introduction
Web data
Methods
Results
Conclusions
Because new digital activities are rarely—if ever—captured in official state data, researchers must rely on information gathered from alternative sources (Zook and McCanless 2022).
Guide policies for deployment of new technologies
Predictions of introduction times for future technologies (Meade and Islam 2021):
Network operators
Suppliers of network equipment
Regulatory authorities
As in temporal diffusion models, an S-shaped pattern in the cumulative level of adoption
A hierarchy effect: from main centres to secondary ones – central places
A neighbourhood effect: diffusion proceeds outwards from innovation centres, first “hitting” nearby rather than far-away locations (Grubler 1990)
Hägerstrand (1965): from innovative centres (core) through a hierarchy of sub-centres, to the periphery
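The S-shaped pattern above is often written as a logistic curve. The source does not state which functional form is used, so the expression below is only an illustrative assumption; \(K\) (saturation level), \(r\) (growth rate) and \(t_{0}\) (inflection year) are hypothetical symbols introduced here.

\[Cumulative\,Adoption_{t} = \frac{K}{1 + e^{-r(t - t_{0})}}\]

Under this form adoption grows slowly at first, fastest around \(t_{0}\), and flattens as it approaches \(K\).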
Diffusion of an intangible/digital technology
Map the active engagement with the digital
Over time, early stages of the internet
Granular and multi-scale spatial perspective
Data from the Internet Archive, the oldest web archive
Observe commercial websites 1996-2012 in the UK (.co.uk)
Geolocation: postcode references in the text
Timestamp: archival year
Counts
JISC UK Web Domain Dataset: all archived webpages from the .uk domain 1996-2012
Curated by the British Library
Tranos, E., and C. Stich. 2020. Individual internet usage and the availability of online content of local interest: A multilevel approach. Computers, Environment and Urban Systems, 79:101371.
Tranos, E., T. Kitsos, and R. Ortega-Argilés. 2021. Digital economy in the UK: Regional productivity effects of early adoption. Regional Studies, 55:12, 1924-1938.
Stich, C., E. Tranos and M. Nathan. 2022. Modelling clusters from the ground up: a web data approach. Environment and Planning B, in press.
Tranos, E., A. C. Incera and G. Willis. 2022. Using the web to predict regional trade flows: data extraction, modelling, and validation. Annals of the AAG, in press.
All .uk archived webpages which contain a UK postcode in the web text
Circa 0.5 billion URLs with valid UK postcodes
20080509162138/http://www.website1.co.uk/contact_us IG8 8HD
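A record like the one above can be split into archival year, URL and postcode. The following is a minimal sketch, assuming the `timestamp/url postcode` layout shown in the example line; `parse_record` and the simplified postcode pattern are hypothetical and not taken from the dataset's own tooling.

```python
import re

# Simplified UK postcode pattern (an assumption: it also accepts some invalid codes).
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}[0-9][A-Z0-9]?)\s*([0-9][A-Z]{2})\b")


def parse_record(line):
    """Split a 'timestamp/url postcode' record into year, URL and postcode."""
    location, _, tail = line.partition(" ")
    timestamp, _, url = location.partition("/")
    match = POSTCODE_RE.search(tail.upper())
    postcode = f"{match.group(1)} {match.group(2)}" if match else None
    # The first four digits of the archival timestamp give the archival year.
    return {"year": int(timestamp[:4]), "url": url, "postcode": postcode}


record = parse_record("20080509162138/http://www.website1.co.uk/contact_us IG8 8HD")
print(record)
```

A production pipeline would also validate each match against a list of real postcodes, since a pattern alone cannot tell a valid code from a look-alike string.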
All the archived .uk webpages
Archived during 1996-2012
Commercial webpages (.co.uk)
From webpages to websites:
- http://www.website1.co.uk/webpage1 and
- http://www.website1.co.uk/webpage2 are part of the same website
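One simple way to move from webpages to websites is to group URLs by host name. This is a sketch under that assumption; the source does not say how websites were delimited, and `website_of` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def website_of(url):
    """Return the lower-cased host name, used here as the website identifier."""
    return (urlsplit(url).hostname or "").lower()


pages = [
    "http://www.website1.co.uk/webpage1",
    "http://www.website1.co.uk/webpage2",
]
sites = {website_of(page) for page in pages}
print(sites)
```

Both pages collapse to a single website key, so postcodes found on either page are attributed to the same website.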
One vs. multiple postcodes in a website
| Postcodes per website (interval) | Frequency | Proportion | Cumulative frequency | Cumulative proportion |
|---|---|---|---|---|
| (0,1] | 41,596 | 0.718 | 41,596 | 0.718 |
| (1,2] | 6,451 | 0.111 | 48,047 | 0.830 |
| (2,10] | 6,163 | 0.106 | 54,210 | 0.936 |
| (10,100] | 2,975 | 0.051 | 57,185 | 0.988 |
| (100,1000] | 646 | 0.011 | 57,831 | 0.999 |
| (1000,10000] | 62 | 0.001 | 57,893 | 1.000 |
| (10000,100000] | 4 | 0.000 | 57,897 | 1.000 |
Websites with a large number of postcodes: e.g. directories, real estate websites
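A frequency table of this shape can be rebuilt from per-website postcode counts. The sketch below assumes right-closed intervals with the same upper bounds as the table; the input counts are made up for illustration, and `interval_label` and `frequency_table` are hypothetical names.

```python
from bisect import bisect_left
from collections import Counter

# Right-closed upper bounds of the intervals, matching the table.
EDGES = [1, 2, 10, 100, 1000, 10000, 100000]


def interval_label(n):
    """Return the right-closed interval label that contains n."""
    if not 0 < n <= EDGES[-1]:
        raise ValueError("count outside the tabulated range")
    i = bisect_left(EDGES, n)
    lower = 0 if i == 0 else EDGES[i - 1]
    return f"({lower},{EDGES[i]}]"


def frequency_table(counts):
    """Rows of (interval, freq, proportion, cumulative freq, cumulative proportion)."""
    freq = Counter(interval_label(n) for n in counts)
    total, running, rows = len(counts), 0, []
    for label in (interval_label(edge) for edge in EDGES):
        running += freq[label]
        rows.append((label, freq[label], round(freq[label] / total, 3),
                     running, round(running / total, 3)))
    return rows


# Hypothetical postcode counts for six websites.
table = frequency_table([1, 1, 1, 2, 5, 250])
for row in table:
    print(row)
```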
Focus on websites with one unique postcode per year
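The filter described above can be sketched as follows, assuming records have already been reduced to `(website, year, postcode)` tuples. The sample records and the function name are hypothetical.

```python
from collections import defaultdict


def single_postcode_sites(records):
    """Keep (website, year) pairs that reference exactly one unique postcode."""
    postcodes = defaultdict(set)
    for website, year, postcode in records:
        postcodes[(website, year)].add(postcode)
    return {key: next(iter(codes))
            for key, codes in postcodes.items() if len(codes) == 1}


# Hypothetical records: one single-location site and one directory-like site.
records = [
    ("www.website1.co.uk", 2008, "IG8 8HD"),
    ("www.website1.co.uk", 2008, "IG8 8HD"),
    ("www.directory.co.uk", 2008, "B15 2TT"),
    ("www.directory.co.uk", 2008, "M13 9PL"),
]
kept = single_postcode_sites(records)
print(kept)
```

Repeated mentions of the same postcode do not disqualify a website; only distinct postcodes within a year do.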
S-shaped pattern in the cumulative level of adoption
A hierarchy effect: from main centres to secondary ones
A neighbourhood effect: first “hitting” nearby locations
Model cumulative adoption
Descriptive statistics, ESDA & density regressions
Modelling framework
Two scales:
Spatial heterogeneity
No clear, easily explained pattern
Adoption heterogeneity
Different perceptions of risk and economic returns from new technologies
Early adopters vs. laggards, leapfrogging
Spatial heterogeneity
Expected volatility
Neighbourhood effect: diffusion proceeds outwards from innovation centres, first “hitting” nearby rather than far-away locations (Grubler 1990)
Spatial dependency (Moran’s I & LISA maps)
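Global Moran's I summarises spatial dependency in a single number. The sketch below implements the standard formula for a user-supplied spatial weights matrix; the four-area example and its adjacency weights are hypothetical, and the source does not specify which weights were used.

```python
def morans_i(values, weights):
    """Global Moran's I for values x_i and a spatial weights matrix w_ij."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Four areas along a line; neighbours share a border (binary adjacency).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered = morans_i([1, 2, 3, 4], W)    # similar values sit next to each other
alternating = morans_i([1, 0, 1, 0], W)  # neighbours always differ
print(clustered, alternating)
```

Positive values indicate that neighbouring areas have similar website densities, and negative values indicate that neighbours tend to differ.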
Website density regressions – distance effect
Websites per firm in Local authorities (c. 400)
Websites in Output Areas (c. 200,000)
\[Website\,Density_{i} = a + \beta Distance\,to\,Place_{i} + e_{i}\]
\(Website\,Density_{i}\):
Websites per firm in a Local Authority \(i\), or
Websites in an Output Area \(i\)
\(Place\):
London, or
Nearest city, or
Nearest retail centre
\(\beta\) interpretation:
The lower \(\beta\) is (that is, the larger \(|\beta|\) is for a negative \(\beta\)), the stronger the urban gravitation for web adoption: density falls faster with distance from the place.
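The density regression has a single predictor, so \(a\) and \(\beta\) can be estimated with the closed-form ordinary least squares solution. This is a minimal sketch on made-up numbers; the distances, densities and function name are hypothetical, and the source does not state the estimator used.

```python
def ols_intercept_slope(distance, density):
    """Closed-form OLS for density_i = a + beta * distance_i + e_i."""
    n = len(distance)
    mean_x = sum(distance) / n
    mean_y = sum(density) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distance, density))
    sxx = sum((x - mean_x) ** 2 for x in distance)
    beta = sxy / sxx
    a = mean_y - beta * mean_x
    return a, beta


# Hypothetical data: website density falls with distance (km) to the place.
distance_km = [0, 10, 20, 30]
website_density = [0.9, 0.7, 0.5, 0.3]
a, beta = ols_intercept_slope(distance_km, website_density)
print(a, beta)
```

A more negative `beta` means density drops more steeply with distance, which is the sense in which a lower \(\beta\) signals stronger urban gravitation.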
Hierarchy effect: from main centres to secondary ones – central places
Almost perfect polarisation of web adoption in the early stages at a granular level
Polarisation decreases over time
More equally diffused at the Local Authority level
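The source does not say how polarisation is measured. One common concentration measure is the Gini coefficient, sketched below purely as an assumption; values near 1 mean a few areas hold almost all websites, and 0 means an even spread.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = equal, near 1 = concentrated)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    # Rank-weighted sum over the ascending-sorted values.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical website counts in four areas.
polarised = gini([0, 0, 0, 1])  # one area holds every website
even = gini([1, 1, 1, 1])       # websites spread evenly
print(polarised, even)
```

With a finite number of areas the maximum is \((n-1)/n\), so comparisons across scales with very different numbers of areas need care.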
Random forests to predict \(Website\,Density_{i,t}\)
4 sets of models:
Space-time sensitive 10-fold CV (CAST)
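The idea behind space-time sensitive cross-validation is that each test fold holds out both locations and years, so the model is never tested on areas or periods it has seen. The sketch below illustrates that idea in plain Python with a deterministic fold assignment; it does not reproduce the API or the random assignment of the CAST R package, and `spacetime_folds` is a hypothetical name.

```python
def spacetime_folds(samples, k):
    """Build k (train_idx, test_idx) pairs from (location, year) samples.

    Fold f tests on samples whose location AND year fall in fold f and
    trains on samples whose location and year both fall outside fold f.
    """
    locations = sorted({loc for loc, _ in samples})
    years = sorted({year for _, year in samples})
    loc_fold = {loc: i % k for i, loc in enumerate(locations)}
    year_fold = {year: i % k for i, year in enumerate(years)}
    folds = []
    for f in range(k):
        test = [i for i, (loc, year) in enumerate(samples)
                if loc_fold[loc] == f and year_fold[year] == f]
        train = [i for i, (loc, year) in enumerate(samples)
                 if loc_fold[loc] != f and year_fold[year] != f]
        folds.append((train, test))
    return folds


# Hypothetical panel: two areas observed in two years, split into two folds.
samples = [(loc, year) for loc in ("A", "B") for year in (2000, 2001)]
folds = spacetime_folds(samples, k=2)
print(folds)
```

Samples that share only a location or only a year with the test fold are dropped from that fold's training set, which is what keeps the evaluation honest in both dimensions.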
| | RMSE | RSquared | MAE |
|---|---|---|---|
| Local Authorities | 0.022 | 0.908 | 0.012 |
| Output Areas | 3.613 | 0.627 | 0.533 |
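The three error metrics in the table can be computed from observed and predicted values as sketched below. The source does not say how RSquared was defined; this sketch assumes the coefficient of determination, \(1 - SS_{res}/SS_{tot}\), and the sample numbers are made up.

```python
import math


def regression_metrics(observed, predicted):
    """Return (RMSE, R-squared, MAE) for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r_squared = 1 - ss_res / ss_tot
    return rmse, r_squared, mae


# Hypothetical hold-out values.
rmse, r_squared, mae = regression_metrics([1, 2, 3, 4], [1.5, 2, 3, 3.5])
print(rmse, r_squared, mae)
```

RMSE and MAE are in the units of the outcome, so the Local Authority figures (websites per firm) and the Output Area figures (websites) are not directly comparable; RSquared is the scale-free column.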
Established technological diffusion drivers still apply
Geography matters: spatial dependency, urban gravitation
Some indications of a hierarchical diffusion
Granular analysis reveals patterns otherwise not visible
Stability and volatility: leapfrogging, early adopters dropping, but also stable positions
Spatially consistent mechanisms at local scale
Heterogeneity increases with resolution